NetNews Offline 2

Internet Message Format | 1996-08-05 | 1.8 KB

Path: hydra.zrz.TU-Berlin.DE!rawneiha
From: rawneiha@hydra.zrz.TU-Berlin.DE (Philipp Boerker)
Newsgroups: comp.sys.amiga.programmer
Subject: Re: Texture/Gouraud innerloop speedtests
Date: 19 Feb 1996 14:38:55 GMT
Organization: Technical University Berlin, Germany
Message-ID: <4ga21v$lsk@brachio.zrz.TU-Berlin.DE>
References: <38232464@kone.fipnet.fi>
NNTP-Posting-Host: hydra.zrz.tu-berlin.de

"Jyrki Saarinen" <jsaarinen@kone.fipnet.fi> writes:

>Ok, I did a little research. My CPU is a 40MHz 68040,
>a Warp Engine with a very fast memory system, maybe
>this is the reason I did not gain any speed even if
>I turned the data cache and thus data burst off,
>with data burst everything was about 50% slower.

Not very surprising! Data burst means that whenever a cache miss
occurs, the CPU loads 4 longwords (one 16-byte cache line) around the
memory area where the data to be fetched is. For a texture-mapping loop
this means that for almost every pixel fetched from the texture the CPU
keeps the bus busy for 4 memory cycles!

>So the frame rates were for a 320x256 screen:
>Texture/Gouraud/Shading table, 64k aligned: ~43 fps
>Plain Texture, 64k aligned: ~67 fps

fps? Are these figures for the mere repetition (320*256 times) of the
innerloop?

>        move.b  (a3),d1
>        move.l  d1,a3
>        add.l   a2,a1
>        move.b  (a3),(a0)+
>        dbf     d7,poly
>        rts

>The places were schedeling was most effective were the
>        move.l  d0,a3
>        <something here is a must>
>        move.b  (a3),d1
> <or>
>        move.b  (a3),(a0)+

>        move.b  (a3,d0.l),d1
>        move.b  (a4,d1.l),(a0)+
[...]
>        dbf     d7,poly
>        rts

If I understand your problem right, you wonder why the two versions are
almost equal in terms of speed? The scheduling is not optimal in either
version: the data you fetch is used by the very next instruction.

Greets,
Phil.
grond/matrix
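To put rough numbers on the data burst argument above, here is a minimal sketch in C. It is a toy model, not a 68040 simulation: it assumes 16-byte lines filled as 4 longword transfers on every miss, treats only the most recently filled line as cached (the real data cache holds many lines), and counts one bus transfer per byte fetch when the cache is off. The function names `burst_transfers` and `uncached_transfers` are made up for this illustration, and the counts are bus transfers, not clock cycles, since real timing depends on the memory system.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_BYTES         16u  /* assumed cache line size: 4 longwords */
#define LONGWORDS_PER_LINE 4u

/* Longword bus transfers needed for `count` byte fetches that start at
 * `start` and advance by `stride` bytes, with burst line fills enabled.
 * Toy assumption: only the most recently filled line counts as cached. */
static unsigned long burst_transfers(uint32_t start, uint32_t stride,
                                     unsigned count)
{
    /* sanity check on the model constants */
    assert(LINE_BYTES == LONGWORDS_PER_LINE * 4u);

    unsigned long transfers = 0;
    uint32_t cached_line = UINT32_MAX;  /* nothing cached yet */
    uint32_t addr = start;

    for (unsigned i = 0; i < count; i++) {
        uint32_t line = addr / LINE_BYTES;
        if (line != cached_line) {
            /* miss: the whole line is fetched, wanted or not */
            transfers += LONGWORDS_PER_LINE;
            cached_line = line;
        }
        addr += stride;
    }
    return transfers;
}

/* With the data cache off, assume each byte fetch is one bus transfer. */
static unsigned long uncached_transfers(unsigned count)
{
    return count;
}
```

Under these assumptions, 320 texels fetched with a stride of 256 bytes (walking down a 256-byte-wide texture) cost 1280 transfers with burst against 320 without, while a stride of 1 costs only 80 with burst. That is the pattern described above: burst pays off for sequential access and hurts when nearly every texel lands in a different line.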